Applications of topological data analysis to natural language processing and computer vision
2022 Spring. Includes bibliographical references. Topological Data Analysis (TDA) uses ideas from topology to study the "shape" of data. It provides a set of tools to extract features, such as holes, voids, and connected components, from complex high-dimensional data. This thesis presents an introductory exposition of the mathematics underlying the two main tools of TDA: Persistent Homology and the MAPPER algorithm. Persistent Homology detects topological features that persist over a range of resolutions, capturing both local and global geometric information. The MAPPER algorithm is a visualization tool that provides a type of dimensionality reduction that preserves topological properties of the data by projecting them onto lower-dimensional simplicial complexes. Furthermore, this thesis explores recent applications of these tools to natural language processing and computer vision. These applications are divided into two main approaches. In the first approach, TDA is used to extract features from data, which are then used as input for a variety of machine learning tasks, like image classification or visualizing the semantic structure of text documents. The second approach applies the tools of TDA to the machine learning algorithms themselves, for example, using MAPPER to study how structure emerges in the weights of a trained neural network. Finally, the results of several experiments are presented. These include using Persistent Homology for image classification and using MAPPER to visualize the global structure of these data sets. Most notably, the MAPPER algorithm is used to visualize vector representations of contextualized word embeddings as they move through the encoding layers of the BERT-base transformer model.
A Multiple-Objective Decision Analysis of Stakeholder Values to Identify Watershed Improvement Needs
The paper describes the use of multiple objective decision analysis to qualitatively and quantitatively assess the quality of an endangered watershed and guide future efforts to improve the quality of the watershed. The Upham Brook watershed is an urban watershed that lies at the interface of declining inner city Richmond, Virginia and growth-oriented Henrico County. A section of stream within the watershed has been identified as so dangerously polluted that it threatens the health of the residents who live within the watershed boundaries. With funding provided by the National Science Foundation, the Upham Brook watershed project committee was formed to address the quality of the Upham Brook watershed; it consisted of experts from multiple disciplines: stream ecology, environmental policy, water policy, ground and surface water hydrology and quality, aquatic biology, political science, sociology, citizen participation, community interaction, psychology, and decision and risk analysis. Each member's values and goals were brought together using a watershed management framework to meet the overall objective of the committee: to maximize the quality of the Upham Brook watershed. The resulting model was used to identify the largest value gaps and to identify future programs needed to improve the quality of the watershed.
Near-Infrared Photometric Survey of Proto-Planetary Nebula Candidates
We present JHK' photometric measurements of 78 objects mostly consisting of
proto-planetary nebula candidates. Photometric magnitudes are determined by
means of imaging and aperture photometry. Unlike the observations with a
photometer with a fixed-sized beam, the method of imaging photometry permits
accurate derivation of photometric values because the target sources can be
correctly identified and confusion with neighboring sources can be easily
avoided. Of the 78 sources observed, we report 10 cases in which the source
seems to have been misidentified or confused by nearby bright sources. We also
present nearly two dozen cases in which the source seems to have indicated a
variability which prompts a follow-up monitoring. There are also a few sources
that show previously unreported extendedness. In addition, we present H band
finding charts of the target sources. Comment: 3 tables, 1 figure
Cellular Ser/Thr-Kinase Assays Using Generic Peptide Substrates
High-throughput cellular profiling has successfully stimulated early drug discovery pipelines by facilitating targeted as well as opportunistic lead finding, hit annotation and SAR analysis. While automation-friendly universal assay formats exist to address most established drug target classes like GPCRs, NHRs, ion channels or Tyr-kinases, no such cellular assay technology is currently enabling an equally broad and rapid interrogation of the Ser/Thr-kinase space. Here we present the foundation of an emerging cellular Ser/Thr-kinase platform that involves a) coexpression of targeted kinases with promiscuous peptide substrates and b) quantification of intracellular substrate phosphorylation by homogeneous TR-FRET. Proof-of-concept data is provided for cellular AKT, B-RAF and CamK2δ assays. Importantly, comparable activity profiles were found for well characterized B-Raf inhibitors in TR-FRET assays relying on either promiscuous peptide substrates or a MEK1(WT) protein substrate respectively. Moreover, IC50-values correlated strongly between cellular TR-FRET assays and a gold standard Ba/F3 proliferation assay for B-Raf activity. Finally, we expanded our initial assay panel by screening a kinase-focused cDNA library and identified starting points for >20 cellular Ser/Thr-kinase assays.
Constraints on Type Ib/c and GRB Progenitors
Although there is strong support for the collapsar engine as the power source
of long-duration gamma-ray bursts (GRBs), we still do not definitively know the
progenitor of these explosions. Here we review the current set of progenitor
scenarios for long-duration GRBs and the observational constraints on these
scenarios. Examining these, we find that single-star models cannot be the only
progenitor for long-duration GRBs. Several binary progenitors can match the
solid observational constraints and also have the potential to match the trends
we are currently seeing in the observations. Type Ib/c supernovae are also
likely to be produced primarily in binaries; we discuss the relationship
between the progenitors of these explosions and those of the long-duration
GRBs. Comment: 36 pages, 6 figures
First radial velocity results from the MINiature Exoplanet Radial Velocity Array (MINERVA)
The MINiature Exoplanet Radial Velocity Array (MINERVA) is a dedicated
observatory of four 0.7m robotic telescopes fiber-fed to a KiwiSpec
spectrograph. The MINERVA mission is to discover super-Earths in the habitable
zones of nearby stars. This can be accomplished with MINERVA's unique
combination of high precision and high cadence over long time periods. In this
work, we detail changes to the MINERVA facility that have occurred since our
previous paper. We then describe MINERVA's robotic control software, the
process by which we perform 1D spectral extraction, and our forward modeling
Doppler pipeline. In the process of improving our forward modeling procedure,
we found that our spectrograph's intrinsic instrumental profile is stable for
at least nine months. Because of that, we characterized our instrumental
profile with a time-independent, cubic spline function based on the profile in
the cross dispersion direction, with which we achieved a radial velocity
precision similar to using a conventional "sum-of-Gaussians" instrumental
profile: 1.8 m/s over 1.5 months on the RV standard star HD 122064.
Therefore, we conclude that the instrumental profile need not be perfectly
accurate as long as it is stable. In addition, we observed 51 Peg and our
results are consistent with the literature, confirming our spectrograph and
Doppler pipeline are producing accurate and precise radial velocities. Comment: 22 pages, 9 figures, submitted to PASP, Peer-Reviewed and Accepted
An Inflationary Scenario in Intersecting Brane Models
We propose a new scenario for D-term inflation which appears quite
straightforwardly in the open string sector of intersecting brane models. We
take the inflaton to be a chiral field in a bifundamental representation of the
hidden sector and we argue that a sufficiently flat potential can be brane
engineered. This type of model generically predicts a near-Gaussian red
spectrum with negligible tensor modes. We note that this model can very
naturally generate a baryon asymmetry at the end of inflation via the recently
proposed hidden sector baryogenesis mechanism. We also discuss the possibility
that Majorana masses for the neutrinos can be simultaneously generated by the
tachyon condensation which ends inflation. Our proposed scenario is viable for
both high and low scale supersymmetry breaking. Comment: 30 pages, 2 figures; v2 references and comments added
Continuous flexibility analysis of SARS-CoV-2 spike prefusion structures
Using a new consensus-based image-processing approach together with principal component analysis, the flexibility and conformational dynamics of the SARS-CoV-2 spike in the prefusion state have been analysed. These studies revealed concerted motions involving the receptor-binding domain (RBD), N-terminal domain, and subdomains 1 and 2 around the previously characterized 1-RBD-up state, which have been modeled as elastic deformations. It is shown that in this data set there are no well-defined, stable spike conformations, but virtually a continuum of states. An ensemble map was obtained with minimum bias, from which the extremes of the change along the direction of maximal variance were modeled by flexible fitting. The results provide a warning of the potential image-processing classification instability of these complicated data sets, which has a direct impact on the interpretability of the results. The authors would like to acknowledge financial support from
CSIC (PIE/COVID-19 No. 202020E079), the Comunidad de
Madrid through grant CAM (S2017/BMD-3817), the Spanish
Ministry of Science and Innovation through projects SEV
2017-0712, FPU-2015/264 and PID2019-104757RB-I00/AEI/
FEDER, the Instituto de Salud Carlos III [PT17/0009/0010
(ISCIII-SGEFI/ERDF)], and the European Union and
Horizon 2020 through grants INSTRUCT-ULTRA
(INFRADEV-03-2016-2017, Proposal 731005), EOSC Life
(INFRAEOSC-04-2018, Proposal 824087), HighResCells
(ERC-2018-SyG, Proposal 810057), IMpaCT (WIDESPREAD-
03-2018, Proposal 857203), CORBEL
(INFRADEV-1-2014-1, Proposal 654248) and EOSC-Synergy
(EINFRA-EOSC-5, Proposal 857647). HDT and BF were
supported by NIH grant GM125769 and JSM was supported
by NIH grant R01-AI12752
Tuning MPL signaling to influence hematopoietic stem cell differentiation and inhibit essential thrombocythemia progenitors
Thrombopoietin (TPO) and the TPO-receptor (TPO-R, or c-MPL) are essential for hematopoietic stem cell (HSC) maintenance and megakaryocyte differentiation. Agents that can modulate TPO-R signaling are highly desirable for both basic research and clinical utility. We developed a series of surrogate protein ligands for TPO-R, in the form of diabodies (DBs), that homodimerize TPO-R on the cell surface in geometries that are dictated by the DB receptor binding epitope, in effect "tuning" downstream signaling responses. These surrogate ligands exhibit diverse pharmacological properties, inducing graded signaling outputs, from full to partial TPO agonism, thus decoupling the dual functions of TPO/TPO-R. Using single-cell RNA sequencing and HSC self-renewal assays we find that partial agonistic diabodies preserved the stem-like properties of cultured HSCs, but also blocked oncogenic colony formation in essential thrombocythemia (ET) through inverse agonism. Our data suggest that dampening downstream TPO signaling is a powerful approach not only for HSC preservation in culture, but also for inhibiting oncogenic signaling through the TPO-R.